Identification and Analysis of PEPC Gene Family Reveals Functional Diversification in Orchidaceae and the Regulation of Bacterial-Type PEPC

The phosphoenolpyruvate carboxylase (PEPC) gene family plays a crucial role in both plant growth and response to abiotic stress. Approximately half of the Orchidaceae species are estimated to perform crassulacean acid metabolism (CAM), and the availability of sequenced orchid genomes makes them ideal subjects for investigating the PEPC gene family in CAM plants. In this study, a total of 33 PEPC genes were identified across 15 orchids. Specifically, one PEPC gene was found in Cymbidium goeringii and Platanthera guangdongensis; two in Apostasia shenzhenica, Dendrobium chrysotoxum, D. huoshanense, Gastrodia elata, G. menghaiensis, Phalaenopsis aphrodite, Ph. equestris, and Pl. zijinensis; three in C. ensifolium, C. sinense, D. catenatum, D. nobile, and Vanilla planifolia. These PEPC genes were categorized into four subgroups, namely PEPC-i, PEPC-ii, and PEPC-iii (PTPC), and PEPC-iv (BTPC), supported by comprehensive analyses of their physicochemical properties, motifs, and gene structures. Remarkably, PEPC-iv contained a previously unreported orchid PEPC gene, identified as VpPEPC4. Differences in the number of PEPC homolog genes among these species were attributed to segmental duplication, whole-genome duplication (WGD), or gene loss events. Cis-elements identified in promoter regions were predominantly associated with light responsiveness, and circadian-related elements were observed in each PEPC-i and PEPC-ii gene. The recruited BTPC, VpPEPC4, exhibited a lower expression level than other VpPEPCs in the tested tissues. The expression analyses and RT-qPCR results revealed diverse expression patterns in orchid PEPC genes. Duplicated genes exhibited distinct expression patterns, suggesting functional divergence. This study offers a comprehensive analysis of the evolution and function of PEPC genes in Orchidaceae.

The PEPC gene family plays a vital role in both monocot and dicot species, and its characterization has been extensively studied in various plants, including Arabidopsis thaliana [8], Saccharum spp. [14], Oncidiinae spp. [3], Kalanchoë spp. [15], and other plants. Four PEPCs have been identified in A. thaliana with three plant-type phosphoenolpyruvate carboxylase (PTPC) and one bacterial-type phosphoenolpyruvate carboxylase (BTPC) [8], and ten in Glycine max with seven PTPC and three BTPC [16]. Five PEPCs were identified in sedges, with the recruitment of the ppc-1 gene into the C4 pathway [17]. Comprehensive genomic and transcriptomic analyses indicated that a PEPC gene (Dca1b) might be associated with CAM in Dendrobium catenatum, as evidenced by its higher expression in major photosynthetic tissues [18]. Numerous functional studies of PEPC genes have also been reported. Setaria viridis, characterized by the low expression levels of PEPC, grows slowly even under high concentrations of CO2 [19]. The potential functions of PEPC were studied in response to abiotic stresses, such as salt, cold, and drought. The cis-regulatory elements related to various abiotic stresses were identified in the promoter regions of PEPC in soybean, and their transcript abundance and enzyme activities were altered by aluminum, cold, salt, and other stress [16]. Under the treatment of 25 µmol AlCl3 (pH 4.3) for a duration of 24 h, the PEPC activity of soybean leaves increased slightly, while that of roots increased first and then decreased [16]. The overexpression of PEPC significantly enhanced photosynthetic efficiency and tolerance to environmental abiotic stresses [20-22].
Crassulacean acid metabolism (CAM) is one of the important carbon fixation pathways due to its higher water-use efficiency and drought tolerance [9,23]. PEPC, the key enzyme for nocturnal CO2 fixation in the CAM pathway, has attracted considerable research attention. Based on the transcriptomic dataset, the PEPC gene family has been identified in Oncidiinae, including C3, weak-CAM, and strong-CAM plants, and the number of PEPCs increased gradually along with the type of photosynthesis [3]. The PEPCs have also been investigated based on genomic analyses, such as three PEPCs in pineapple [24] and two in Cymbidium mannii [25]. The diel expression patterns of PEPC genes have been reported in many CAM plants [24-27]. In C. mannii, a copy of the PEPC gene (PPC1;3) exhibited a markedly higher expression level than other copies and displayed rhythmic expression and day-night differences in protein abundance, implying that this copy played a dominant role in the fixation of CO2 [25]. Moreover, transgenic Kalanchoë laxiflora lacking the PEPC gene showed significantly reduced nocturnal CO2 fixation and malate accumulation, along with perturbed stomatal closure during the light period [15].

Identification and Phylogenetic Analysis of PEPC Genes
In total, 33 PEPC genes were identified across 15 orchid genomes, and the number of PEPC genes ranged from one to three in different species (one in C. goeringii and Pl. guangdongensis; two in A. shenzhenica, D. chrysotoxum, D. huoshanense, G. elata, G. menghaiensis, Ph. aphrodite, Ph. equestris, and Pl. zijinensis; three in C. ensifolium, C. sinense, D. catenatum, D. nobile, and V. planifolia). The new names of the predicted PEPC genes are listed in Table 1. For a more detailed analysis of the 33 PEPCs, their physicochemical properties were predicted using ExPASy (https://web.expasy.org/protparam/, accessed on 8 August 2023). The results showed that all PEPCs shared similar physicochemical properties (Table 1). These PEPC sequences exhibited slight variation in amino acid length, ranging from 834 aa (CsPEPC1) to 1,076 aa (CePEPC2), with an average length of 960 aa. The molecular weights (MW) varied from 95.42 kDa (CsPEPC1) to 122.43 kDa (CePEPC2), with an average MW of 109.48 kDa. The theoretical isoelectric point ranged from 5.71 (AsPEPC3 and PgPEPC3) to 6.64 (DePEPC1), with an average of 6.03. All PEPCs were regarded as acidic (theoretical isoelectric point < 7) and hydrophilic (grand average of hydropathicity < 0) proteins (Table 1). Subcellular localization prediction suggested that all PEPCs are likely located only in the cytoplasm.
To explore the evolution of PEPC genes among the 15 orchid genomes, phylogenetic trees were constructed using the neighbor-joining (NJ) method. The multiple protein sequence alignment of the 1297 aa PEPC fragments from 33 PEPC proteins showed 346 conserved sites, 698 variable sites, 281 parsimony-informative sites, and 364 singleton sites. The analysis demonstrated that the 33 orchid PEPC proteins share a single origin and can be divided into PTPC and BTPC, with 32 and 1 members in the orchids, respectively (Figure 1). BTPC formed a branch distant from PTPC. The 33 PEPCs were further classified into four subgroups: PEPC-i, PEPC-ii, and PEPC-iii (PTPC), and PEPC-iv (BTPC). PEPC-iii (13 members) contained the most complete set of PEPC members from the 15 orchids, followed by PEPC-i (12 members). Notably, this study revealed that PEPC-iv contained an orchid PEPC from V. planifolia (VpPEPC4), which has not been reported before.
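The "acidic" and "hydrophilic" labels above rest on two simple thresholds (theoretical isoelectric point < 7 and grand average of hydropathicity < 0). As a minimal sketch of that reasoning, the Python snippet below computes the grand average of hydropathicity (GRAVY) as the mean Kyte-Doolittle hydropathy value of the residues and then applies the two cut-offs. The function names and the toy peptide are hypothetical illustrations; this is not the ProtParam implementation and the peptide is not a real PEPC sequence.

```python
# Kyte-Doolittle hydropathy scale; GRAVY is the mean of these values over a sequence.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Mean hydropathy of the standard residues in a protein sequence."""
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def describe(isoelectric_point: float, gravy_value: float) -> tuple[str, str]:
    """Apply the two thresholds used in the text (pI < 7, GRAVY < 0)."""
    charge_class = "acidic" if isoelectric_point < 7 else "not acidic"
    hydropathy_class = "hydrophilic" if gravy_value < 0 else "not hydrophilic"
    return charge_class, hydropathy_class


# Hypothetical toy peptide, used only to exercise the functions.
toy_peptide = "MDEEKRKSDE"
toy_gravy = gravy(toy_peptide)
toy_labels = describe(6.03, toy_gravy)  # 6.03 is the average pI reported above
```

With the average isoelectric point reported above and any negative GRAVY value, `describe` returns the same two labels the text assigns to all 33 proteins.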

Motif and Gene Structure Analysis of PEPC Genes
In this study, a detailed analysis of PEPC proteins was conducted using the MEME program to identify motif patterns, with a predefined upper limit of 20 motifs (Figure 2B). The proteins encoded by the PTPC genes were conserved and similar in motif patterns, with the number of PEPC motifs ranging from 14 to 21. The protein encoded by BTPC had 13 motifs. PTPC and BTPC shared a partly identical motif pattern, specifically the order of motifs 10, 6, 15, 8, 18, 5, 3, 7, and 1. However, some differences between the two types were observed. Motifs 13, 19, and 2 were present only in PTPC. The number of BTPC motifs was lower than that in PTPC, indicating a discernible divergence between the two types.
To obtain deeper insights into the potential structural evolution of PEPCs, we conducted a comparative analysis of their exon-intron composition (Figure 2C). Different exon-intron structures were observed between the two types, but no discernible differences were observed within clades. PTPC consisted of 9-12 exons and 8-11 introns, whereas BTPC exhibited a more complex structure with 20 exons and 19 introns. While a similarity in the gene structure was found within each clade, orchid PEPCs demonstrated a substantial degree of variability in intron length and number in comparison with A. thaliana. Most orchid PEPCs exhibited longer introns than those of A. thaliana. Some orchid PEPCs exhibited variations in the number of introns; for instance, CePEPC2 has 11 introns, whereas CsPEPC1, CsPEPC3, and PzPEPC3 have only eight introns.
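The motif comparison above amounts to two checks: whether a protein carries the shared motifs in the stated order, and which motifs occur in one type but not the other. The following Python sketch illustrates both under loudly stated assumptions: the per-protein motif lists are hypothetical (the text reports only the shared order and the PTPC-only motifs, not full motif layouts), and the helper name is invented for illustration.

```python
# Shared motif order reported in the text for both PTPC and BTPC.
SHARED_ORDER = [10, 6, 15, 8, 18, 5, 3, 7, 1]


def contains_in_order(motifs: list[int], pattern: list[int]) -> bool:
    """True if `pattern` occurs in `motifs` as an ordered (not necessarily contiguous) subsequence."""
    remaining = iter(motifs)
    # `p in remaining` consumes the iterator up to the match, which enforces ordering.
    return all(p in remaining for p in pattern)


# Hypothetical motif layouts, invented only to exercise the checks.
ptpc_like = [13, 10, 6, 15, 19, 8, 18, 5, 2, 3, 7, 1]
btpc_like = [10, 6, 15, 8, 18, 5, 3, 7, 1, 4]
scrambled = [6, 10, 15, 8, 18, 5, 3, 7, 1]

ptpc_has_shared_order = contains_in_order(ptpc_like, SHARED_ORDER)
btpc_has_shared_order = contains_in_order(btpc_like, SHARED_ORDER)
scrambled_has_shared_order = contains_in_order(scrambled, SHARED_ORDER)

# Motifs present in the PTPC-like layout but absent from the BTPC-like one.
ptpc_only = set(ptpc_like) - set(btpc_like)
```

In this toy setup the set difference recovers motifs 13, 19, and 2, mirroring the PTPC-only motifs named in the text; real layouts would come from the MEME output.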

Cis-Elements in the Promoter Regions of PEPC Genes
To investigate the regulatory functions of the orchid PEPCs, we retrieved the 2000 bp promoter regions of six orchids to identify putative cis-acting regulatory elements (CREs). A comprehensive analysis unveiled a total of 402 CREs, including 45 types and 21 responsive functions (Figure 3 and Supplementary Table S1). Cis-element functions included developmental elements such as light responsiveness, endosperm expression, meristem expression, and circadian control; phytohormone responsiveness for abscisic acid (ABA), auxin, gibberellin, methyl jasmonate (MeJA), and salicylic acid; and stress responsiveness such as anoxia and low temperature (Figure 3A). Each PEPC gene harbored multiple types of elements, with light responsiveness (186, 46.3%) emerging as the most prevalent functional category. MeJA-responsiveness elements (50, 12.4%) ranked as the second most abundant, followed by abscisic acid responsiveness (30, 7.5%). Among these elements, Box4 was the most common (64, 15.9%) and was present in each PEPC, followed by G-box (33, 8.2%) (Figure 3B). Elements such as ABRE, G-box, and circadian elements were closely associated with the regulation of genes involved in the circadian rhythm. In this study, ABRE and G-box elements were present in each PEPC gene in PEPC-i and PEPC-ii, except for PaPEPC1. In contrast, the circadian element was only present in DnPEPC2.
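The percentages quoted above follow directly from the element counts and the total of 402 CREs. The short Python sketch below reproduces that arithmetic using only the counts stated in the text; the variable and function names are illustrative.

```python
# Total number of cis-acting regulatory elements reported in the text.
TOTAL_CRES = 402

# Counts reported in the text for functional categories and individual elements.
reported_counts = {
    "light responsiveness": 186,
    "MeJA responsiveness": 50,
    "abscisic acid responsiveness": 30,
    "Box4": 64,
    "G-box": 33,
}


def share(count: int, total: int = TOTAL_CRES) -> float:
    """Percentage of all CREs, rounded to one decimal place as in the text."""
    return round(100 * count / total, 1)


shares = {name: share(count) for name, count in reported_counts.items()}
```

Each computed share matches the value given in parentheses in the paragraph above.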

Chromosomal Localization and Collinearity Analysis of PEPC Genes
We conducted gene localization on chromosomes and gene duplication analysis to elucidate the homologous relationships among genes across the 15 orchid species. Within each species, the PEPC genes were located on distinct chromosomes (Figure 4). A total of five gene pairs were identified within species (CePEPC1 and CePEPC2, CsPEPC1 and CsPEPC2, DcPEPC1 and DcPEPC2, DhPEPC1 and DhPEPC2, and DnPEPC1 and DnPEPC2) and were regarded as duplicated genes. The scattered distribution of PEPC genes on the chromosomes indicated that the duplicated genes in each orchid may result from segmental duplication.
To study the evolutionary relationships of the orchid PEPC gene family, MCScanX was used to identify duplicated gene pairs. The collinear relationships among the 15 PEPC genes of C. ensifolium, D. chrysotoxum, D. huoshanense, D. nobile, G. menghaiensis, and V. planifolia were examined (Figure 5). Duplicated genes within and between species were investigated, and 38 gene pairs were obtained. To explore the selective constraints on duplicated PEPC genes in the six orchids, the 38 gene pairs were used to calculate the ratio of the number of non-synonymous substitutions per non-synonymous site (Ka) to the number of synonymous substitutions per synonymous site (Ks). The results showed that the Ka/Ks ratios of all PEPC gene pairs were less than 1, implying that the orchid PEPC genes mainly experienced strong purifying selection after segmental duplication or WGD (Supplementary Table S2).

Expression Analysis of PEPC Genes
To elucidate the function of PEPC genes in orchids, their spatial and temporal expression patterns were analyzed based on transcriptome datasets. Here, we focused on the developmental seed transcriptome of Ph. equestris at 4, 7, and 12 days (Figure 6A). The temporal analysis unveiled dynamic expression patterns; the expression of PePEPCs was upregulated from the fourth day to the seventh day, followed by a downregulation on the 12th day. Notably, PePEPC3 consistently exhibited higher expression levels than PePEPC1. In addition, the expression in developing seeds of V. planifolia at six, eight, and ten weeks, as well as in three-, five-, and six-month pods, was investigated based on the transcriptome data (Figure 6B). Distinct expression patterns of VpPEPCs in seeds were demonstrated. VpPEPC3 was most abundantly expressed, while VpPEPC4, a BTPC, was barely expressed. VpPEPC3 demonstrated an upward trend from six weeks to eight weeks, followed by a decline and a slight rise at six months. These results collectively underscored the intricate dynamics of PEPC gene expression and emphasized their important roles in seed germination.
Figure 5 (caption fragment): the six orchids are shown in different colors and labeled Gm, Vp, Ce, De, Dh, and Dn, respectively. The gene pairs among different species are shown in different colors.

The spatial expression patterns of PEPC genes in A. shenzhenica, D. catenatum, Ph. aphrodite, and V. planifolia were analyzed to investigate gene biological functions and functional diversity (Figure 7). The expression results revealed that the PEPC genes were widely expressed in both vegetative tissues (leaf, root, and stem) and reproductive tissues (flower, pollinium, and seed). These findings indicated a diverse array of biological functions in growth and development associated with PEPC genes. Various expression patterns were observed for PEPC genes in different tissues. In A. shenzhenica, AsPEPC3 displayed prominent expression in the sequenced tissues, while AsPEPC2 exhibited lower expression in all tissues (Figure 7A). The pronounced expression of AsPEPC3 in floral tissues (inflorescence and pollinium) and vegetative tissues (root, stem, and tuber) suggested its involvement in multiple aspects of plant growth. In D. catenatum, DcPEPC1 exhibited the highest expression levels in the stem and leaf, while DcPEPC2 showed minimal expression (Figure 7B), indicating functional divergence between duplicated PEPC genes. DcPEPC3 exhibited high expression in all the tested tissues, but lower than that of DcPEPC1 in the stem and leaf. In V. planifolia, the bacterial-type PEPC (VpPEPC4), not reported before, exhibited lower expression than other VpPEPCs in all the tested tissues (Figure 7C). VpPEPC2 displayed the highest expression level in the leaf, stem, root, and mesocarp of the pod. In Ph. aphrodite, PaPEPC1 showed the highest expression in the leaf, followed by the flower, and the lowest in the root, pollinia, flower bud, and stalk (Figure 7D). PaPEPC3 demonstrated a relatively low expression level in the tested tissues. In summary, orchid PEPC genes showed key functions in various tissues, including seed germination, photosynthetic function, floral development, and root elongation.

RT-qPCR of PEPC Genes
RT-qPCR was employed to corroborate the expression profiles of three PEPC genes (DcPEPC1, DcPEPC2, and DcPEPC3) in the stem, leaf, and root tissues. The results showed that DcPEPC1 exhibited the highest expression in the leaf, followed by the stem and root (Figure 8, Supplementary Table S5). DcPEPC3 exhibited the highest expression levels in the root, followed by the stem and leaf. These results aligned with the expression patterns observed in the transcriptome data. However, a slight difference in the expression pattern of DcPEPC2 was observed between RT-qPCR and transcriptome data. In RT-qPCR, DcPEPC2 demonstrated the highest expression level in the root, while transcriptome data indicated a higher expression in the leaf. Overall, the expression profiles of the three genes in the three tissues, as determined by RT-qPCR, were largely consistent with the transcriptome data, supporting the reliability of the aforementioned expression patterns.

Identification and Phylogenetic Analysis of PEPC Genes
The enzymes encoded by PEPC genes are responsible for primary CO2 fixation into OAA and malate [46]. The PEPC gene family plays a vital role in plant growth and development and has been studied in several plant families, including Crassulaceae, Fabaceae, and Poaceae [16,26,47]. However, systematic identification and functional reports of the PEPC gene family in Orchidaceae based on genome data remain limited. This study identified 33 PEPC genes from 15 orchid genomes that contained both C3 and CAM plants [48-51] (Table 1). The results showed that each species contained one to three PEPC homolog genes, fewer than the counts found in Gossypium [52], wheat, and sorghum [10]. In contrast to the varying tendency of PEPC numbers in Oncidiinae [3], the quantity of PEPC appeared to be unrelated to the type of photosynthesis in this study. This suggested that the gene dosage effect played a minor role in the CAM pathway, consistent with other perspectives [18,43].
A total of 33 PEPC proteins from 15 orchid genomes and four PEPC proteins from A. thaliana were combined to construct a phylogenetic tree. The result showed that orchid PEPCs were divided into two subfamilies: PTPC and BTPC. Further classification of the PTPC revealed three subclades (Figure 1). The phylogenetic relationship was consistent with previous studies [3,18,52]. Both phylogenetic and sequence analyses converged to suggest a single origin prior to the divergence of bacteria and plant lineages (Figure 1, Table 1). PEPC genes with different coding types in plants evolved independently and showed distant relationships (Figure 1). PEPC-iii contained 13 PEPC members, covering most orchid species, followed by PEPC-i, implying that, compared with PEPC-ii and PEPC-iv, the functional characteristics of PEPC in these two subgroups were relatively conserved. The results indicated numerous duplication and loss events during the evolutionary trajectory, contributing to the variations in the numbers of PEPC homolog genes among different species [18].
In plants, the PEPC gene family mainly consists of PTPC and BTPC. Most species recruit PTPC, whereas some species recruit BTPC. However, BTPC genes are usually expressed at low levels or exhibit higher expression levels in non-photosynthetic tissues [53]. Notably, Isoetes taiwanensis has been documented to recruit BTPC, with expression levels surpassing those of PTPC [27]. This study highlighted the presence of VpPEPC4 within PEPC-iv (Figure 1), a finding not previously reported in orchids. However, conclusive evidence regarding the functional specialization of this PEPC gene within CAM in V. planifolia requires validation through further transcriptomic studies.

Motif and Gene Structure of PEPC Genes
The motif and domain analysis revealed the high conservation of PEPC genes in orchids (Figure 2), consistent with other species, such as maize and sorghum [10]. The gene structure analysis revealed that the number of introns and exons in most orchid PEPC genes was similar to that of AtPEPCs but with some variation in intron length. The orchid genes exhibited longer introns than their homologs in A. thaliana, consistent with previous reports [36,54]. The prevalence of longer introns in Orchidaceae likely represents a distinctive feature of this plant family. CePEPC2 and PgPEPC3 had unusually long introns (Figure 2C), suggesting that extremely long introns might be attributed to species-specific evolution. The preference for longer introns over shorter ones has been posited to relieve Hill-Robertson (HR) interference and thereby enhance the efficiency of natural selection [55], which might contribute to the biodiversity of Orchidaceae.

Cis-Elements in the Promoter Regions of PEPC Genes
Gene expression levels tend to be regulated by the cis-elements of promoter regions [56]. In this study, we identified the cis-elements within the 2000 bp promoter region of 17 PEPC genes from seven orchids (Figure 3). Our results showed a multitude of cis-elements implicated in light responsiveness, indicating the important role of light as a regulatory factor influencing PEPC gene functions. Many elements interacting with MYB transcription factors were identified. These MYB-binding sites were implicated in response to light and drought, suggesting the potential involvement of MYB transcription factors in regulating PEPC expression levels under stress conditions. Similar findings were reported in C4 species [10]. This study suggested that the expression profile of the PEPC gene was regulated by multiple factors. Furthermore, compared with PEPC-iii, the promoter regions of PEPC genes in PEPC-i and PEPC-ii contained cis-elements associated with the circadian rhythm, implying that these two subgroups might be involved in photosynthesis.

Chromosomal Localization and Collinearity Analysis of PEPC Genes
Gene duplication events are a crucial factor in gene expansion or structural and functional divergence [57]. Using collinearity analysis, 38 gene pairs were identified (Figure 5), potentially attributed to WGD or segmental duplication, as evidenced by their chromosomal location [18,52]. The number of copies per species ranged from one to three, indicative of probable duplication and loss events during evolution [18]. Subsequently, these gene pairs were used for Ka/Ks analysis. The Ka/Ks ratio is essential for exploring genomic evolution [58] and can serve as an indicator of purifying selection (Ka/Ks < 1), neutral mutation (Ka/Ks = 1), or positive selection (Ka/Ks > 1). Purifying selection is recognized for its role in preserving existing biological functions [59]. Our study found that all the examined gene pairs exhibited Ka/Ks ratios less than 1 (Supplementary Table S2), suggesting that these PEPCs underwent strong purifying selection, contributing to their high conservation in orchids.

Expression Analysis and RT-qPCR of PEPC Genes
During the extensive evolutionary history of orchid PEPC genes, duplicated genes may experience functional divergence [60]. Some PEPC genes in PEPC-i and PEPC-ii were identified as duplicated genes (Figures 1 and 4). In D. catenatum, DcPEPC1 demonstrated elevated expression levels, while its duplicate DcPEPC2 exhibited minimal expression in all sequenced tissues (Figure 7B). The distinct expression patterns observed between the two genes suggested that the duplicated genes may have experienced neofunctionalization [52,60].
In V. planifolia, VpPEPC4 was considered a bacterial-type PEPC gene based on phylogenetic analysis (Figure 1). However, the transcriptome data analysis showed that VpPEPC4 was barely expressed in all the tested tissues (Figure 7C), contrasting with the high expression of BTPC in the green tissues of I. taiwanensis [27]. VpPEPC2 exhibited elevated expression in the stem and leaves. In D. catenatum, DcPEPC1 displayed prominent expression in the stem and leaf, much higher than that in other tissues (Figure 7B). The RT-qPCR experiments of DcPEPC1 further verified these results (Figure 8). In Ph. aphrodite, PaPEPC1 showed the most abundant expression level in the stem (Figure 7D). The expression patterns suggested the potential recruitment of DcPEPC1, PaPEPC1, and VpPEPC2 in the CAM pathway. Therefore, combined with the prediction of cis-elements, we hypothesized that genes within the PEPC-i and PEPC-ii clades were CAM-related genes, supporting the previous report [18].

Homolog Gene Identification and Sequence Analysis
The PEPC protein sequences of the model plant A. thaliana were used to conduct a BLAST search for orthologs within orchids based on the genomic data, employing the BLAST module (Blastp) of TBtools (version 1.121) [63] and retaining hits with e-values less than 1 × 10⁻¹⁰. Detailed information on the BLAST results was provided in Supplementary Table S3. Subsequently, all protein sequences underwent validation using the NCBI batch CD-search tool (https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi, accessed on 7 October 2023) with a threshold of 0.01 to scrutinize the conserved domain (PF00311), and proteins lacking the complete domain were eliminated. The MEME Suite 5.5.1 online tool (https://meme-suite.org/meme/, accessed on 7 October 2023) [64] was used to analyze motif composition. The intron-exon structure was discerned by the Visualize Gene Structure program of TBtools (version 1.121) [63]. Finally, the visualization of these results was accomplished using the Gene Structure View module of TBtools (version 1.121) [63].
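
The e-value filter described above can be sketched in a few lines. This is only an illustration, not the TBtools implementation: it assumes the hits are available in the common 12-column BLAST tabular layout (e-value in column 11), and the function name `filter_hits` and the example rows are hypothetical.

```python
import csv
import io

EVALUE_CUTOFF = 1e-10  # cutoff stated in the text


def filter_hits(tabular_text, cutoff=EVALUE_CUTOFF):
    """Return (query, subject, e-value) for hits with e-value below cutoff.

    Assumes the 12-column BLAST tabular layout, in which the 11th column
    (index 10) holds the e-value. This layout is an assumption here.
    """
    hits = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        evalue = float(row[10])
        if evalue < cutoff:
            hits.append((row[0], row[1], evalue))
    return hits


# Hypothetical rows; identifiers and scores are invented for illustration.
EXAMPLE = (
    "query_1\tsubject_A\t85.0\t960\t144\t0\t1\t960\t1\t960\t0.0\t1700\n"
    "query_1\tsubject_B\t31.2\t80\t55\t2\t10\t89\t5\t84\t2e-05\t45.1\n"
)

kept = filter_hits(EXAMPLE)
```

Only the first invented row passes the cutoff; the second (e-value 2e-05) is discarded. Domain validation with CD-search would still be a separate step.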

Multiple Sequence Alignment and Phylogenetic Tree Construction
Multiple sequences were aligned using MUSCLE integrated into MEGA7 with the default parameters [67]. The phylogenetic tree of PEPCs was constructed using the neighbor-joining (NJ) method in MEGA7 [67]. Parameters included the 'Poisson model' and the 'Pairwise deletion' option with a bootstrap test of 1000 replicates. Finally, the phylogenetic tree was imported into iTOL (https://itol.embl.de/itol.cgi, accessed on 30 November 2023) [68] for refinement and polishing, employing circle and none-leaf sorting parameters, and Adobe Illustrator CC 2018 was used to add clade labels.
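
To make the 'Poisson model' and 'Pairwise deletion' settings concrete, the pairwise distance that feeds an NJ tree can be sketched as follows. This is a minimal sketch of the standard Poisson-corrected distance, d = −ln(1 − p), not MEGA's code; the function name, the choice of gap characters, and the toy sequences are assumptions.

```python
import math


def poisson_distance(seq_a, seq_b, gap_chars="-?"):
    """Poisson-corrected distance between two aligned protein sequences.

    Pairwise deletion: a column is skipped only if either of the two
    sequences being compared has a gap or missing character there.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    different = 0
    for a, b in zip(seq_a, seq_b):
        if a in gap_chars or b in gap_chars:
            continue  # pairwise deletion of this column
        compared += 1
        if a != b:
            different += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = different / compared  # proportion of differing sites
    if p >= 1.0:
        raise ValueError("distance undefined when all sites differ")
    return -math.log(1.0 - p)


# Toy aligned fragments (invented): one gap column, one substitution.
d = poisson_distance("MKTA-LV", "MKSAYLV")
```

In the toy pair, six columns are compared after dropping the gap column and one differs, so p = 1/6. A full NJ reconstruction would compute this for every sequence pair and then cluster the resulting matrix.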

Prediction of Cis-Acting Elements
The 2000 bp regions upstream of all PEPCs were extracted using the Gtf/Gff3 Sequence Extract and Fasta Extract programs integrated into TBtools (version 1.121) [63]. The online website PlantCARE (https://bioinformatics.psb.ugent.be/webtools/plantcare/html/, accessed on 18 October 2023) [69] was utilized for identifying the putative cis-acting elements in the promoter regions. The Basic Biosequence View module of TBtools (version 1.121) [63] was used to display the findings of cis-acting element annotation and counts.
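
The upstream extraction step is strand-dependent, which the following sketch illustrates. It is not the TBtools routine: it assumes 1-based inclusive GFF-style gene coordinates and treats the region immediately 5' of the annotated gene as the promoter; the function names and the toy chromosome are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chrom_seq, start, end, strand, length=2000):
    """Return up to `length` bp upstream of a gene, 5'->3' on its strand.

    `start` and `end` are 1-based inclusive coordinates (GFF convention).
    The region is clipped at the chromosome ends.
    """
    if strand == "+":
        lo = max(1, start - length)
        hi = start - 1
        return chrom_seq[lo - 1:hi]
    if strand == "-":
        lo = end + 1
        hi = min(len(chrom_seq), end + length)
        # Upstream of a minus-strand gene lies at higher coordinates and
        # must be reverse-complemented to read in the gene's orientation.
        return reverse_complement(chrom_seq[lo - 1:hi])
    raise ValueError("strand must be '+' or '-'")


# Toy 12 bp chromosome (invented) with short windows for readability.
chrom = "ACGTTGCAAGCT"
plus_promoter = upstream_region(chrom, 7, 9, "+", length=3)
minus_promoter = upstream_region(chrom, 1, 3, "-", length=4)
```

With real data, `length` would stay at its default of 2000 and the returned sequences would be submitted to PlantCARE.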

Chromosomal Localization and Collinearity Analysis
After obtaining information on the chromosomal locations of the PEPC gene family across 15 orchids from the corresponding genome annotations, the chromosomal location map was generated using the Gene Location Visualize program in TBtools (version 1.121) [63]. The collinear relationship among six orchids was delineated and visualized by the One Step MCScanX fast program of TBtools (version 1.121) [63]. Subsequently, the Ka/Ks ratios of duplicated genes were calculated by the Simple Ka/Ks Calculator integrated into TBtools (version 1.121) [63], using the Nei and Gojobori (NG) method. If Ka > Ks or Ka/Ks > 1, the gene was generally considered to undergo positive selection; if Ka = Ks or Ka/Ks = 1, the gene was subject to neutral evolution; and if Ka < Ks or Ka/Ks < 1, the gene underwent purifying selection [70].
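
The interpretation rule for Ka/Ks can be written down directly. The sketch below only classifies precomputed Ka and Ks values; it does not implement the Nei and Gojobori estimation itself. The function name, the tolerance used to test for equality with 1, and the example values are assumptions, not values from Supplementary Table S2.

```python
import math


def classify_selection(ka, ks, tol=1e-9):
    """Return (Ka/Ks, label) following the thresholds given in the text."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if math.isclose(ratio, 1.0, abs_tol=tol):
        return ratio, "neutral"
    if ratio > 1.0:
        return ratio, "positive"
    return ratio, "purifying"


# Invented gene pairs for illustration only.
pairs = {
    "pair_1": (0.05, 0.50),
    "pair_2": (0.30, 0.20),
}
labels = {name: classify_selection(ka, ks)[1] for name, (ka, ks) in pairs.items()}
```

Pairs with Ks equal to zero are rejected rather than classified, since the ratio is undefined for them.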

RT-qPCR
The plant materials used in this study were sourced from the National Orchid Germplasm Resources of Fujian Agriculture and Forestry University, Fuzhou, China. For RT-qPCR validation, three PEPC genes (DcPEPC1, DcPEPC2, and DcPEPC3) were selected. Total RNA extraction was carried out using the RaPure Total RNA Plus Kit (Magen Biotech Co., Ltd., Guangzhou, China), following the eighth method described in the kit manual for plant samples. Subsequently, cDNA synthesis was performed using the Hifair® AdvanceFast 1st Strand cDNA Synthesis Kit (Yeasen Biotechnology, Shanghai, China). RT-qPCR analysis was conducted on the QuantStudio™ Real-Time PCR (Applied Biosystems, Waltham, MA, USA), employing the Hieff UNICON® Universal Blue qPCR SYBR Green Master Mix (low rox) kit (Yeasen Biotechnology, Shanghai, China). The reaction system comprised a total volume of 20 µL, including 10 µL of Hieff UNICON Universal Blue qPCR SYBR Green Master Mix, 0.4 µL of forward primer (10 µM), 0.4 µL of reverse primer (10 µM), 2 µL of template DNA, and 7.2 µL of sterile ultrapure water. All experiments were conducted using three biological replicates and three technical replicates.
The Ct values obtained from RT-qPCR were analyzed using the 2^(−ΔΔCT) method to determine the relative expression levels of the three genes in different tissues (stem, leaf, and root). The Ct values of roots were used as the control for calculation, and DcGAPDH was used as the internal reference for normalization. Detailed information on RT-qPCR primers was provided in Supplementary Table S4, and RT-qPCR data details were presented in Supplementary Table S5. The results of RT-qPCR were visualized using GraphPad Prism 8.0.1 for Windows (GraphPad Software version 8.0.1, San Diego, CA, USA, www.graphpad.com, accessed on 10 January 2024).
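
The relative-expression calculation can be sketched as follows, with root as the calibrator tissue and a single reference gene, as described above. The Ct values are invented for illustration and are not taken from Supplementary Table S5; the function name is hypothetical, and each input is assumed to be a mean Ct across replicates.

```python
def relative_expression(ct_target, ct_reference, calibrator="root"):
    """Relative expression per tissue by the 2^(-ddCt) method.

    ct_target / ct_reference: dicts mapping tissue -> mean Ct of the
    target gene and of the internal reference gene, respectively.
    """
    # dCt: target Ct normalized to the reference gene within each tissue
    delta_ct = {t: ct_target[t] - ct_reference[t] for t in ct_target}
    # ddCt: each tissue's dCt relative to the calibrator tissue
    base = delta_ct[calibrator]
    return {t: 2.0 ** (-(d - base)) for t, d in delta_ct.items()}


# Invented mean Ct values for one target gene and the reference gene.
target_ct = {"root": 28.0, "stem": 26.0, "leaf": 24.0}
reference_ct = {"root": 20.0, "stem": 20.0, "leaf": 20.0}
fold = relative_expression(target_ct, reference_ct)
```

By construction the calibrator tissue has a relative expression of 1, and a tissue whose normalized Ct is two cycles lower than root comes out fourfold higher.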

Conclusions
In this study, a comprehensive analysis of the orchid PEPC gene family was conducted. A total of 33 orchid PEPC genes were identified from 15 orchid species. Domain, motif pattern, and gene structure analyses revealed high conservation among all orchid PEPCs. These genes were classified into two clades, PTPC and BTPC, exhibiting highly similar characteristics within each respective clade. PTPC was further classified into three subclades. Remarkably, an orchid PEPC in BTPC (VpPEPC4) was identified for the first time. However, the expression level of VpPEPC4 was found to be lower than that of other VpPEPCs. The prevalence of light-responsive cis-acting elements was consistent with the pivotal role of PEPCs in photosynthesis, and circadian-related elements were found in most PEPC-i and PEPC-ii genes. The duplicated genes in D. catenatum showed distinct expression patterns, which were validated by RT-qPCR, indicating functional divergence in orchid evolution. These findings provided valuable insights into the characteristics and functions of PEPC in orchids.

Figure 1. Phylogenetic trees of PEPC proteins based on 33 orchid PEPC proteins and 4 AtPEPC proteins. The different sizes of the circle on the nodes represent bootstrap percentages; the upper l

Figure 2. Gene structure, conserved motifs, and domains of PEPCs. (A) The NJ tree contains 33 orchid PEPCs. (B) Squares of different colors represent conserved motifs of PEPCs. (C) Squares of different colors represent the gene structures of PEPCs.

Figure 3. Cis-acting elements in the promoter regions of PEPC genes. (A) Functions of cis-acting elements in different orchid PEPCs. (B) Number of cis-acting elements in different orchid PEPCs.

Figure 5. Location and orthologs or paralogs of 38 PEPC gene pairs intra- or inter-species of G. menghaiensis, V. planifolia, C. ensifolium, D. chrysotoxum, D. huoshanense, and D. nobile genomes. The chromosomes of G. menghaiensis, V. planifolia, C. ensifolium, D. chrysotoxum, D. huoshanense, and D. nobile are shown with different colors and labeled as Gm, Vp, Ce, De, Dh, and Dn, respectively. The gene pairs among different species are shown with different colors.

Figure 8. RT-qPCR analysis of three DcPEPC genes in D. catenatum in different tissues (stem, leaf, and root). Error bars indicate the SD of three biological replicates.
In summary, orchid PEPC genes showed key functions in various tissues, including seed germination, photosynthetic function, floral development, and root elongation.

Table 1. Information of PEPC homologs in 15 orchid genomes and A. thaliana.